Computational explanation of "fiction text effectivity" for vocabulary improvement: Corpus analyses using latent semantic analysis

Authors

Keisuke Inohara

Akira Utsumi

Abstract

Previous studies have suggested that reading fiction has a stronger positive effect on vocabulary development than reading nonfiction. In this study, we examined this phenomenon in terms of word appearance information in fiction (story texts), nonfiction (explanation texts), and web text using latent semantic analysis (LSA). In a human experiment with Japanese undergraduates, we replicated fiction (story) text effectivity: participants who often read story texts achieved the highest vocabulary test scores. Then, in a corpus experiment, we constructed a story text corpus, an explanation text corpus, and a web text corpus of identical size. Based on these corpora, we calculated the LSA similarities between words and simulated answering the same vocabulary test as used in the human experiment. The corpus experiment instead demonstrated nonfiction (explanation) text effectivity; that is, the explanation corpus achieved the highest score. The cause of the discrepancy between the two results and the educational implications of this study are also discussed.
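The corpus experiment described in the abstract (build an LSA space from a corpus, then answer vocabulary items by word similarity) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the toy English corpus, whitespace tokenization, log weighting, the number of dimensions `k`, and the helper names `build_lsa_space`, `cosine`, and `answer_item` are all assumptions made for the example. Real Japanese text would first need a morphological analyzer for tokenization.

```python
import numpy as np

def build_lsa_space(documents, k=2):
    """Build word vectors from a truncated SVD of a term-document matrix."""
    vocab = sorted({w for doc in documents for w in doc.split()})
    index = {w: i for i, w in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(documents)))
    for j, doc in enumerate(documents):
        for w in doc.split():
            counts[index[w], j] += 1.0
    # Log weighting is an assumed choice; the abstract does not state the weighting.
    weighted = np.log1p(counts)
    u, s, _ = np.linalg.svd(weighted, full_matrices=False)
    k = min(k, len(s))
    # Each row is one word's coordinates in the k-dimensional latent space.
    vectors = u[:, :k] * s[:k]
    return index, vectors

def cosine(a, b):
    """Cosine similarity, returning 0.0 when either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def answer_item(target, options, index, vectors):
    """Pick the option whose vector is most similar to the target word."""
    def score(option):
        if target not in index or option not in index:
            return float("-inf")  # words missing from the corpus cannot be scored
        return cosine(vectors[index[target]], vectors[index[option]])
    return max(options, key=score)

# Hypothetical toy corpus standing in for one of the three corpora.
toy_corpus = [
    "the doctor treated the patient in the hospital",
    "the physician treated the patient in the clinic",
    "the farmer planted wheat in the field",
    "the farmer harvested wheat and corn",
]
index, vectors = build_lsa_space(toy_corpus, k=2)
options = ["physician", "farmer", "wheat"]
choice = answer_item("doctor", options, index, vectors)
print(choice)
```

Under these assumptions, running the same set of test items against spaces built from each corpus and counting the correct choices would give one simulated vocabulary score per corpus, which is the kind of comparison the abstract reports.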


Similar resources

The Gutenberg English Poetry Corpus: Exemplary Quantitative Narrative Analyses

This paper describes a corpus of about 3,000 English literary texts with about 250 million words extracted from the Gutenberg project that span a range of genres from both fiction and non-fiction written by more than 130 authors (e.g., Darwin, Dickens, Shakespeare). Quantitative narrative analysis (QNA) is used to explore a cleaned subcorpus, the Gutenberg English Poetry Corpus (GEPC), which co...


How Important Is Size? An Investigation of Corpus Size and Meaning in Both Latent Semantic Analysis and Latent Dirichlet Allocation

This study examines how differences in corpus size influence the accuracy of Latent Semantic Analysis (LSA) spaces and Latent Dirichlet Allocation (LDA) spaces in two tasks: a word association task and a vocabulary definition test. Specific optimizations were considered in building each semantic model. Initial results indicate that larger corpora lead to greater accuracy and that LDA probabilis...


Discovering objects and their location in images with Latent Dirichlet Allocation

We seek to discover object categories and their locations in a set of unlabelled images. We achieve this using probabilistic models developed in the text understanding community to discover interesting topics in a corpus of text documents. We hope that the application of these models to a set of images will discover visual topics corresponding to object categories. We show how to form the visua...


Explorations in an English Poetry Corpus: A Neurocognitive Poetics Perspective

This paper describes a corpus of about 3000 English literary texts with about 250 million words extracted from the Gutenberg project that span a range of genres from both fiction and non-fiction written by more than 130 authors (e.g., Darwin, Dickens, Shakespeare). Quantitative Narrative Analysis (QNA) is used to explore a cleaned subcorpus, the Gutenberg English Poetry Corpus (GEPC) which comp...


Situation and Text: Representation of Migrants Whilst the Escalation of Refugee Crisis in Great Britain as Compared to Russia

Increasing migration is a vital concern for a globalizing sociocultural environment in today’s world. The UK and developed European countries have become an attractive destination for asylum seekers (labelled as “migrants”) in the past decade. The rapid rise in the number of asylum seekers, which was labelled “migration crisis” (Ruz, 2015), made this topic an integral part of scientific discuss...


Publication date: 2016
